Abstract

We study the problem of computing the minimum vertex cover on $k$-uniform $k$-partite hypergraphs
when the $k$-partition is given. On bipartite graphs ($k = 2$), the minimum vertex cover can be computed in
polynomial time. For general $k$, the problem was studied by Lovász [23], who gave a $\frac{k}{2}$-approximation
based on the standard LP relaxation. Subsequent work by Aharoni, Holzman and Krivelevich [1] showed
a tight integrality gap of $\left(\frac{k}{2} - o(1)\right)$ for the LP relaxation. While this problem was known to be NP-hard
for $k \ge 3$, the first non-trivial NP-hardness of approximation factor of $\frac{k}{4} - \varepsilon$ was shown in a recent work
by Guruswami and Saket [13]. They also showed that assuming Khot's Unique Games Conjecture yields
a $\frac{k}{2} - \varepsilon$ inapproximability for this problem, implying the optimality of Lovász's result.

In this work, we show that this problem is NP-hard to approximate within $\frac{k}{2} - 1 + \frac{1}{2k} - \varepsilon$. This
hardness factor is off from the optimal by an additive constant of at most 1 for $k \ge 4$. Our reduction
relies on the Multi-Layered PCP of [8] and uses a gadget - based on biased Long Codes - adapted
from the LP integrality gap of [1]. The nature of our reduction requires the analysis of several Long
Codes with different biases, for which we prove structural properties of the so-called cross-intersecting
collections of set families - variants of which have been studied in extremal set theory.






1 Introduction 



A $k$-uniform hypergraph $G = (V, E)$ consists of a set of vertices $V$ and a collection of hyperedges $E$ such
that each hyperedge contains exactly $k$ vertices. A vertex cover for $G$ is a subset of vertices $V' \subseteq V$ such
that every hyperedge $e$ contains at least one vertex from $V'$, i.e. $e \cap V' \ne \emptyset$. Equivalently, a vertex cover is a
hitting set for the collection of hyperedges $E$. The complement of a vertex cover is called an Independent
Set, which is a subset of vertices $\mathcal{I}$ such that no hyperedge $e \in E$ is contained inside $\mathcal{I}$, i.e. $e \not\subseteq \mathcal{I}$.

The $k$-HypVC problem is to compute the minimum vertex cover in a $k$-uniform hypergraph $G$. It is an
extremely well studied combinatorial optimization problem, especially on graphs ($k = 2$), and is known
to be NP-hard. Indeed, the minimum vertex cover problem on graphs was one of Karp's original 21
NP-complete problems [19]. On the other hand, the simple greedy algorithm that picks a maximal collection of
disjoint hyperedges and includes all vertices of these hyperedges in the vertex cover gives a $k$-approximation, which
is also obtained by the standard LP relaxation of the problem. The best algorithms known today achieve
only a marginally better approximation factor of $(1 - o(1))k$ [18, 15].
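The greedy algorithm mentioned above can be sketched as follows. This is only an illustrative sketch: the function name `greedy_vertex_cover` and the representation of hyperedges as tuples of vertex labels are assumptions made here, not notation from the paper.

```python
from typing import FrozenSet, Hashable, Iterable, List, Set, Tuple


def greedy_vertex_cover(
    hyperedges: Iterable[Iterable[Hashable]],
) -> Tuple[Set[Hashable], List[FrozenSet[Hashable]]]:
    """Pick a maximal collection of pairwise disjoint hyperedges and
    return the union of their vertices together with the collection."""
    cover: Set[Hashable] = set()
    matching: List[FrozenSet[Hashable]] = []
    for edge in hyperedges:
        e = frozenset(edge)
        # An edge joins the matching only if it is disjoint from every
        # edge picked so far, i.e. disjoint from the current cover.
        if cover.isdisjoint(e):
            matching.append(e)
            cover |= e
    return cover, matching


# Hypothetical 3-uniform example.
edges = [(1, 2, 3), (3, 4, 5), (5, 6, 7), (1, 4, 7)]
cover, matching = greedy_vertex_cover(edges)
```

Every hyperedge that was skipped meets the cover, so the output is a vertex cover. Any vertex cover must contain a distinct vertex from each hyperedge of the disjoint collection, which is where the factor $k$ comes from.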

On the intractability side, there have been several results. For the case $k = 2$, Dinur and Safra [9] obtained an
NP-hardness of approximation factor of 1.36, improving on a $\frac{7}{6} - \varepsilon$ hardness by Håstad [14]. For general $k$
a sequence of successive works yielded improved NP-hardness factors: $\Omega(k^{1/19})$ by Trevisan [27]; $\Omega(k^{1-\varepsilon})$
by Holmerin [16]; $k - 3 - \varepsilon$ by Dinur, Guruswami and Khot [7]; and the currently best $k - 1 - \varepsilon$ due to
Dinur, Guruswami, Khot and Regev [8]. In [8], the authors build upon [7] and the work of Dinur and Safra
[9]. Moreover, assuming Khot's Unique Games Conjecture (UGC) [20], Khot and Regev [21] showed an
essentially optimal $k - \varepsilon$ inapproximability. This result was further strengthened in different directions by
Austrin, Khot and Safra IH and by Bansal and Khot 

Vertex Cover on k-uniform k-partite Hypergraphs 

In this paper we study the minimum vertex cover problem on $k$-partite $k$-uniform hypergraphs, when the underlying
partition is given. We denote this problem as $k$-HypVC-Partite. This is an interesting problem
in itself and its variants have been studied for applications related to databases such as distributed data
mining [10], schema mapping discovery [11] and optimization of finite automata [17]. On bipartite graphs
($k = 2$), by König's Theorem computing the minimum vertex cover is equivalent to computing the maximum
matching, which can be done efficiently. For general $k$, the problem was studied by Lovász who, in his
doctoral thesis [23], proved the following upper bound.

Theorem 1.1 (Lovász [23]) For every $k$-partite $k$-uniform hypergraph $G$: $\mathrm{VC}(G)/\mathrm{LP}(G) \le k/2$, where
$\mathrm{VC}(G)$ denotes the size of the minimum vertex cover and $\mathrm{LP}(G)$ denotes the value of the standard LP
relaxation. This yields an efficient $k/2$ approximation for $k$-HypVC-Partite.

The above upper bound was shown to be tight by Aharoni, Holzman and Krivelevich [1] who proved the
following theorem.

Theorem 1.2 (Aharoni et al. [1]) For every $k \ge 3$, there exists a family of $k$-partite $k$-uniform hypergraphs
$G$ such that $\mathrm{VC}(G)/\mathrm{LP}(G) \ge k/2 - o(1)$. Thus, the integrality gap of the standard LP relaxation is
$k/2 - o(1)$.

A proof of the above theorem describing the integrality gap construction is included in Section A. The
problem was shown to be APX-hard in [11] and [17] for $k = 3$, which can be extended easily to $k \ge 3$.
A recent work of Guruswami and Saket [13] showed the following non-trivial hardness of approximation
factor for general $k$.

Theorem 1.3 (Guruswami and Saket [13]) For any $\varepsilon > 0$ and $k \ge 5$, $k$-HypVC-Partite is NP-hard to
approximate within a factor of $\frac{k}{4} - \varepsilon$. Assuming the UGC yields an optimal hardness factor of $\frac{k}{2} - \varepsilon$ for
$k \ge 3$.

Our Contribution. We show a nearly optimal NP-hardness result for approximating $k$-HypVC-Partite.

Theorem 1.4 For any $\varepsilon > 0$ and integer $k \ge 4$, it is NP-hard to approximate the minimum vertex cover on
$k$-partite $k$-uniform hypergraphs to within a factor of $\frac{k}{2} - 1 + \frac{1}{2k} - \varepsilon$.

Our result significantly improves on the NP-hardness factor obtained in [13] and is off by at most an additive
constant of 1 from the optimal for any $k \ge 4$. The next few paragraphs give an overview of the techniques
used in this work.

Techniques. It is helpful to first briefly review the hardness reduction of [8] for $k$-HypVC, which begins
with the construction of a new Multi-Layered PCP. This is a two variable CSP consisting of several layers
of variables, and constraints between the variables of each pair of layers. The work of [8] shows that it is
NP-hard to find a labeling to the variables which satisfies a small fraction of the constraints between any
two layers, even if there is a labeling that satisfies all the constraints of the instance. The reduction to a
$k$-uniform hypergraph (as an instance of $k$-HypVC) involves replacing each variable of the PCP with a
biased Long Code, defined in [9], where the bias depends on $k$.

The starting point for our hardness reduction for $k$-HypVC-Partite is - as in [8] - the Multi-Layered PCP.
While we do not explicitly construct a standalone Long Code based gadget, our reduction can be thought of
as adapting the integrality gap construction of Aharoni et al. [1] into a Long Code based gadget in a manner
that preserves the $k$-uniformity and $k$-partiteness of the integrality gap.

Such transformations of integrality gaps into Long Code based gadgets have recently been studied in the
works of Raghavendra [25] and Kumar, Manokaran, Tulsiani and Vishnoi [22], which show this for a wide
class of CSPs and their appropriate LP and SDP integrality gaps. These Long Code based gadgets can be
combined with a Unique Games instance to yield tight UGC based hardness results, where the reduction
is analyzed via Mossel's Invariance Principle [24]. Indeed, for $k$-HypVC-Partite the work of Guruswami
and Saket [13] combines the integrality gap of [1] with (a slight modification of) the approach of
Kumar et al. [22] to obtain an optimal UGC based hardness result.

Our reduction, on the other hand, combines Long Codes with the Multi-Layered PCP instead of Unique
Games and so we cannot adopt an Invariance Principle based analysis. Thus, in a flavor similar to that of [8],
our analysis is via extremal combinatorics. However, our gadget involves several biased Long Codes with
different biases and each hyperedge includes vertices from different Long Codes, unlike the construction in
[8]. For our analysis, we use structural properties of a cross-intersecting collection of set families. A collection
of set families is cross-intersecting if any intersection of subsets - each chosen from a different family
- is large. Variants of this notion have previously been studied in extremal set theory, see for example El- 
We prove an upper bound on the measure of the smallest family in such a collection. This enables a small 
vertex cover (in the hypergraph of our reduction) to be decoded into a good labeling to the Multi-Layered
PCP.

The next section defines and analyzes the above mentioned cross-intersecting set families. Section 3 defines
the Multi-Layered PCP of Dinur et al. [8] and states their hardness for it. In Section 4 we describe our
reduction and prove Theorem 1.4.



2 Cross-Intersecting Set Families 

We use the notation $[n] = \{1, \dots, n\}$ and $2^{[n]} = \{F \mid F \subseteq [n]\}$. We begin by defining cross-intersecting
set families:

Definition 2.1 A collection of $k$ families $\mathcal{F}_1, \dots, \mathcal{F}_k \subseteq 2^{[n]}$ is called $k$-wise $t$-cross-intersecting if for every
choice of sets $F_i \in \mathcal{F}_i$ for $i = 1, \dots, k$, we have $|F_1 \cap \dots \cap F_k| \ge t$.

We will work with the $p$-biased measure on the subsets of $[n]$, which is defined as follows:

Definition 2.2 Given a bias parameter $0 < p < 1$, we define the measure $\mu_p$ on the subsets of $[n]$ as:
$\mu_p(F) := p^{|F|} \cdot (1 - p)^{n - |F|}$. The measure of a family $\mathcal{F}$ is defined as $\mu_p(\mathcal{F}) = \sum_{F \in \mathcal{F}} \mu_p(F)$.
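For small $n$, both definitions can be checked by brute force. The sketch below is purely illustrative; the helper names `powerset`, `mu` and `is_cross_intersecting` are assumptions introduced here, and the running time is exponential in $n$.

```python
from itertools import combinations, product
from typing import FrozenSet, Iterable, List, Sequence


def powerset(n: int) -> List[FrozenSet[int]]:
    """All subsets of [n] = {1, ..., n}."""
    return [frozenset(c)
            for size in range(n + 1)
            for c in combinations(range(1, n + 1), size)]


def mu(p: float, n: int, family: Iterable[FrozenSet[int]]) -> float:
    """p-biased measure of a family of subsets of [n] (Definition 2.2)."""
    return sum(p ** len(F) * (1 - p) ** (n - len(F)) for F in family)


def is_cross_intersecting(families: Sequence[Iterable[FrozenSet[int]]],
                          t: int) -> bool:
    """Check Definition 2.1 by trying every choice of one set per family
    (assumes at least one family is given)."""
    return all(len(frozenset.intersection(*choice)) >= t
               for choice in product(*families))


# Hypothetical example: the "dictator" family of sets containing element 1.
n = 3
dictator = [F for F in powerset(n) if 1 in F]
measure = mu(0.3, n, dictator)
```

The dictator family has measure exactly $p$, and two copies of it are 2-wise 1-cross-intersecting because every chosen pair shares element 1.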

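Now, we introduce an important technique for analyzing cross-intersecting families - the shift operation
(see Def 4.1, pg. 1298 [12]). Given a family $\mathcal{F}$, define the $(i, j)$-shift as follows:

$$ S_{ij}^{\mathcal{F}}(F) = \begin{cases} (F \cup \{i\} \setminus \{j\}) & \text{if } j \in F,\ i \notin F \text{ and } (F \cup \{i\} \setminus \{j\}) \notin \mathcal{F}, \\ F & \text{otherwise.} \end{cases} $$

Let the $(i, j)$-shift of a family $\mathcal{F}$ be $S_{ij}(\mathcal{F}) = \{S_{ij}^{\mathcal{F}}(F) \mid F \in \mathcal{F}\}$. Given a family $\mathcal{F} \subseteq 2^{[n]}$, we repeatedly
apply the $(i, j)$-shift for $1 \le i < j \le n$ to $\mathcal{F}$ until we obtain a family that is invariant under these shifts. Such a
family is called a left-shifted family and we will denote it by $S(\mathcal{F})$.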
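The shift operation can be made concrete with a short sketch. The function names `shift_family` and `left_shift` are assumptions for illustration, and the order in which the pairs $(i, j)$ are tried is one arbitrary choice among many.

```python
from typing import FrozenSet, Iterable, Set


def shift_family(family: Set[FrozenSet[int]], i: int, j: int) -> Set[FrozenSet[int]]:
    """Apply the (i, j)-shift to every set of the family."""
    shifted = set()
    for F in family:
        if j in F and i not in F:
            G = (F - {j}) | {i}
            # Replace j by i only if the result is not already in the family.
            shifted.add(F if G in family else G)
        else:
            shifted.add(F)
    return shifted


def left_shift(family: Iterable[Iterable[int]], n: int) -> Set[FrozenSet[int]]:
    """Repeat (i, j)-shifts for 1 <= i < j <= n until nothing changes."""
    fam = {frozenset(F) for F in family}
    changed = True
    while changed:
        changed = False
        for i in range(1, n + 1):
            for j in range(i + 1, n + 1):
                new = shift_family(fam, i, j)
                if new != fam:
                    fam, changed = new, True
    return fam


# Hypothetical example on [3].
shifted = left_shift([{2, 3}, {3}], 3)
```

Each shift that changes the family strictly decreases the total of all elements over all sets, so the loop terminates. The number of sets of each size is preserved.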
The following observations about left-shifted families follow from the definition.

Observation 2.3 Let $\mathcal{F} \subseteq 2^{[n]}$ be a left-shifted family. Consider $F \in \mathcal{F}$ such that $i \notin F$ and $j \in F$ where
$i < j$. Then, $(F \cup \{i\} \setminus \{j\})$ must be in $\mathcal{F}$.

Observation 2.4 Given $\mathcal{F} \subseteq 2^{[n]}$, there is a bijection between the sets in $\mathcal{F}$ and $S(\mathcal{F})$ that preserves the size
of the set. Thus, for any fixed $p$, the measures of $\mathcal{F}$ and $S(\mathcal{F})$ are the same under $\mu_p$, i.e. $\mu_p(\mathcal{F}) = \mu_p(S(\mathcal{F}))$.

The following lemma shows that the cross-intersecting property is preserved under left-shifting.

Lemma 2.5 Consider families $\mathcal{F}_1, \dots, \mathcal{F}_k \subseteq 2^{[n]}$ that are $k$-wise $t$-cross-intersecting. Then, the families
$S(\mathcal{F}_1), \dots, S(\mathcal{F}_k)$ are also $k$-wise $t$-cross-intersecting.

Proof: Given the assumption, we will prove that $S_{ij}(\mathcal{F}_1), \dots, S_{ij}(\mathcal{F}_k)$ are $k$-wise $t$-cross-intersecting. A
simple induction would then imply the statement of the lemma.

Consider arbitrary sets $F_l \in \mathcal{F}_l$. By our assumption, $|F_1 \cap \dots \cap F_k| \ge t$. It suffices to prove that
$|S_{ij}^{\mathcal{F}_1}(F_1) \cap \dots \cap S_{ij}^{\mathcal{F}_k}(F_k)| \ge t$. If $j \notin F_1 \cap \dots \cap F_k$, the claim is true since the only element being deleted is $j$. Thus,
we may assume that $j \in F_l$ for all $l \in [k]$. If for all $l \in [k]$, $S_{ij}^{\mathcal{F}_l}(F_l) = F_l$, the claim is trivial. Thus, let us assume wlog that
$S_{ij}^{\mathcal{F}_1}(F_1) \ne F_1$. Thus, $i \notin F_1$ and hence $i \notin F_1 \cap \dots \cap F_k$. Now, if $i \in S_{ij}^{\mathcal{F}_1}(F_1) \cap \dots \cap S_{ij}^{\mathcal{F}_k}(F_k)$, we get
that $j$ is replaced by $i$ in the intersection and we are done. Thus, we can assume wlog that $i \notin S_{ij}^{\mathcal{F}_2}(F_2)$.
This implies that $i \notin F_2$ and $F_2 \cup \{i\} \setminus \{j\} \in \mathcal{F}_2$. Now consider $F_1 \cap (F_2 \cup \{i\} \setminus \{j\}) \cap F_3 \cap \dots \cap F_k$. Since
we are picking one set from each $\mathcal{F}_l$, it must have at least $t$ elements, but this intersection does not contain
$j$ and hence it is a subset of $S_{ij}^{\mathcal{F}_1}(F_1) \cap \dots \cap S_{ij}^{\mathcal{F}_k}(F_k)$, implying that $|S_{ij}^{\mathcal{F}_1}(F_1) \cap \dots \cap S_{ij}^{\mathcal{F}_k}(F_k)| \ge t$. ■

Next, we prove a key structural lemma about cross-intersecting families which states that for at least one of
the families, all of its subsets have a dense prefix.

Lemma 2.6 Let $q_1, \dots, q_k \in (0, 1)$ be $k$ numbers such that $\sum_i q_i \ge 1$ and let $\mathcal{F}_1, \dots, \mathcal{F}_k \subseteq 2^{[n]}$ be left-shifted
families that are $k$-wise $t$-cross-intersecting for some $t \ge 1$. Then, there exists a $j \in [k]$ such that for
all sets $F \in \mathcal{F}_j$, there exists a non-negative integer $r_F \le n - t$ such that $|F \cap [t + r_F]| > (1 - q_j)(t + r_F)$.

Proof: Let us assume to the contrary that for every $i \in [k]$, there exists a set $F_i \in \mathcal{F}_i$ such that for all
$r \ge 0$, $|F_i \cap [t + r]| \le (1 - q_i)(t + r)$. The following combinatorial argument shows that the families $\mathcal{F}_i$
cannot be $k$-wise $t$-cross-intersecting.

Let us construct an arrangement of balls and bins where each ball is colored with one of $k$ colors. Create $n$
bins labeled $1, \dots, n$. For each $i$ and for every $x \in [n] \setminus F_i$, we place a ball with color $i$ in the bin labeled $x$.
Note that a bin can have several balls, but they must have distinct colors. Given such an arrangement, we
can recover the sets it represents by defining $\overline{F_i}$ to be the set of bins that contain a ball with color $i$.
Our initial assumption implies that $|\overline{F_i} \cap [t + r]| \ge q_i(t + r)$. Thus, there are at least $\lceil q_i(t + r) \rceil$ balls with
color $i$ in bins labeled $1, \dots, t + r$. The total number of balls in bins labeled $1, \dots, t + r$ is,

$$ \sum_{i=1}^{k} |\overline{F_i} \cap [t + r]| \ \ge\ \sum_{i=1}^{k} \lceil q_i(t + r) \rceil \ \ge\ \sum_{i=1}^{k} q_i(t + r) \ \ge\ (t + r) \ \ge\ r + 1, $$

where the last two inequalities follow using $\sum_i q_i \ge 1$ and $t \ge 1$.

Next, we describe a procedure to manipulate the above arrangement of balls.

for r := 0 to n - t
    if bin t + r is empty
        then if a bin labeled from 1 to t - 1 contains a ball then move it to bin t + r
        else if a bin labeled from t to t + r - 1 contains two balls then move one of them to bin t + r
        else output "error"
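The procedure above can be simulated directly. The sketch below is a hedged illustration: the function name `run_procedure`, the 0-based colour indices and the dictionary-of-sets representation of bins are assumptions made here. It does not check the hypothesis on the sets $F_i$; if that hypothesis fails the procedure may raise an error.

```python
from typing import FrozenSet, List, Sequence, Set


def run_procedure(sets: Sequence[Set[int]], n: int, t: int) -> List[FrozenSet[int]]:
    """Simulate the ball-moving procedure.

    sets[i] plays the role of F_i (a subset of [n]); colour i is placed in
    every bin x that is NOT in F_i. Returns the final sets G_i, where G_i is
    the set of bins without a ball of colour i.
    """
    k = len(sets)
    bins = {x: {i for i in range(k) if x not in sets[i]} for x in range(1, n + 1)}
    for r in range(0, n - t + 1):
        target = t + r
        if bins[target]:
            continue  # bin t + r is already non-empty
        # First preference: any ball sitting in bins 1 .. t-1.
        source = next((b for b in range(1, t) if bins[b]), None)
        if source is None:
            # Otherwise: a bin in t .. t+r-1 holding at least two balls.
            source = next((b for b in range(t, target) if len(bins[b]) >= 2), None)
        if source is None:
            raise RuntimeError("error")
        bins[target].add(bins[source].pop())
    return [frozenset(x for x in range(1, n + 1) if i not in bins[x])
            for i in range(k)]


# Hypothetical example with k = 2, q_1 = q_2 = 1/2, n = 4, t = 2:
# F_1 = F_2 = {3, 4} satisfies |F_i ∩ [t + r]| <= (1 - q_i)(t + r) for r = 0, 1, 2.
G = run_procedure([{3, 4}, {3, 4}], n=4, t=2)
common = frozenset.intersection(*G)
```

In this example the final sets $G_i$ intersect in at most $t - 1$ elements, which is the contradiction the proof is after.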

We need the following lemma. 

Lemma 2.7 The above procedure satisfies the following properties:

1. The procedure never outputs error.

2. At every step, any two balls in the same bin have different colors.

3. At step $r$, define $G_i^{(r)}$ to be the set of labels of the bins that do not contain a ball of color $i$. Then, for all
$i \in [k]$, $G_i^{(r)} \in \mathcal{F}_i$.

4. After step $r$, the bins $t$ to $t + r$ have at least one ball each.



Proof: 1. If it outputs error at step $r$, then bin $t + r$ and the bins $1$ to $t - 1$ are empty and each of the bins
$t$ to $t + r - 1$ holds at most one ball, so there are at most $r$ balls in bins $1$ to $t + r$. This is false for the
initial arrangement, which has at least $r + 1$ balls in these bins. Moreover, at any step $r' < r$, we could have moved
a ball only to a bin labeled in $[t, t + r]$, so the number of balls in bins $1$ to $t + r$ never decreases. Thus, we get
a contradiction.

2. Note that this is true at $r = 0$ and a ball is only moved to an empty bin, which proves the claim.

3. Whenever we move a ball from bin $a$ to bin $b$, we have $a < b$. Since the $\mathcal{F}_i$ are left-shifted, by repeated application
of Observation 2.3 we get that at step $r$, $G_i^{(r)} \in \mathcal{F}_i$.

4. Since the procedure never outputs error, at step $r$, if the bin $t + r$ is empty, the procedure places a ball in
it while not emptying any bin labeled between $[t, t + r - 1]$. This proves the claim. ■

The above lemma implies that at the end of the procedure (after $r = n - t$), there is a ball in each of the bins
labeled from $[t, n]$. Thus, the sets $G_i = G_i^{(n-t)}$ satisfy $\cap_i G_i \subseteq [t - 1]$ and hence $|\cap_i G_i| \le t - 1$. Also, we
know that $G_i \in \mathcal{F}_i$. Thus, the families $\mathcal{F}_i$ cannot be $k$-wise $t$-cross-intersecting. This completes the proof
of Lemma 2.6. ■



The above lemma, along with a Chernoff bound argument, shows that given a collection of $k$-wise $t$-cross-intersecting
families, one of them must have a small measure under an appropriately chosen bias.

Lemma 2.8 For arbitrary $\varepsilon, \delta > 0$, there exists some $t = O\left(\frac{1}{\delta^2}\left(\log\frac{1}{\varepsilon} + \log\left(1 + \frac{1}{2\delta^2}\right)\right)\right)$ such that the
following holds: Given $k$ numbers $0 < q_i < 1$ such that $\sum_i q_i \ge 1$ and $k$ families $\mathcal{F}_1, \dots, \mathcal{F}_k \subseteq 2^{[n]}$ that
are $k$-wise $t$-cross-intersecting, there exists a $j$ such that $\mu_{1 - q_j - \delta}(\mathcal{F}_j) < \varepsilon$.

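Proof: First we prove the following lemma derived from the Chernoff bound.

Lemma 2.9 For arbitrary $\varepsilon, \delta > 0$ and $0 < q < 1$, there exists some $t = O\left(\frac{1}{2\delta^2}\left(\log\frac{1}{\varepsilon} + \log\left(1 + \frac{1}{2\delta^2}\right)\right)\right)$
such that the following holds:

Any family $\mathcal{F} \subseteq 2^{[n]}$ that satisfies that for every $F \in \mathcal{F}$, there exists an integer $r_F \ge 0$ such that
$|F \cap [t + r_F]| \ge (1 - q)(t + r_F)$ must have $\mu_{1-q-\delta}(\mathcal{F}) < \varepsilon$.

Proof: Note that $\mu_{1-q-\delta}(\mathcal{F})$ is equal to the probability that a random set $F$ chosen according to
$\mu_{1-q-\delta}$ lies in $\mathcal{F}$. Thus, $\mu_{1-q-\delta}(\mathcal{F})$ is bounded by the probability that for a random set $F$ chosen according
to $\mu_{1-q-\delta}$, there exists an $r_F$ that satisfies $|F \cap [t + r_F]| \ge (1 - q)(t + r_F)$.

The Chernoff bound states that for a set of $m$ independent Bernoulli random variables $X_i$, with
$\Pr[X_i = 1] = 1 - q - \delta$,

$$ \Pr\left[\sum_{i=1}^{m} X_i \ge (1 - q)m\right] \le e^{-2m\delta^2}. $$

Thus, we get that for any $r \ge 0$, $\Pr[|F \cap [t + r]| \ge (1 - q)(t + r)] \le e^{-2(t+r)\delta^2}$. Summing over all $r$, we
get that,

$$ \mu_{1-q-\delta}(\mathcal{F}) \ \le\ \sum_{r \ge 0} e^{-2(t+r)\delta^2} \ =\ \frac{e^{-2t\delta^2}}{1 - e^{-2\delta^2}}. $$

Thus, for $t = \Omega\left(\frac{1}{2\delta^2}\left(\log\frac{1}{\varepsilon} + \log\left(1 + \frac{1}{2\delta^2}\right)\right)\right)$, $\mu_{1-q-\delta}(\mathcal{F})$ will be smaller than $\varepsilon$. ■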
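The summation step can be sanity-checked numerically. The sketch below assumes the per-prefix bound $e^{-2(t+r)\delta^2}$ stated above and sums it as a geometric series. The function names and the specific constant inside `sufficient_t` are assumptions chosen here for illustration; the lemma itself only claims an asymptotic bound on $t$.

```python
import math


def closed_form(t: int, delta: float) -> float:
    """Sum over r >= 0 of exp(-2 (t + r) delta^2), as a geometric series."""
    return math.exp(-2 * t * delta ** 2) / (1 - math.exp(-2 * delta ** 2))


def partial_sum(t: int, delta: float, terms: int) -> float:
    """Finite prefix of the same series, for comparison."""
    return sum(math.exp(-2 * (t + r) * delta ** 2) for r in range(terms))


def sufficient_t(eps: float, delta: float) -> int:
    """One explicit t (an assumed constant) that makes the series at most eps.

    Uses 1 / (1 - exp(-x)) <= 1 + 1/x with x = 2 delta^2.
    """
    x = 2 * delta ** 2
    return math.ceil((math.log(1 / eps) + math.log(1 + 1 / x)) / x)


eps, delta = 0.01, 0.1
t = sufficient_t(eps, delta)
bound = closed_form(t, delta)
```

With this choice of $t$ the closed form is at most $\varepsilon$, matching the shape of the bound in the lemma.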



We now continue with the proof of Lemma 2.8. Our $t$ will be dictated by Lemma 2.9 and will be decided
later. Consider the left-shifted families $S(\mathcal{F}_i)$. By Lemma 2.5 we get that these families are also $k$-wise
$t$-cross-intersecting. Now, we can apply Lemma 2.6 with the given $q_i$'s to conclude that there must exist a $j$
such that for all sets $F \in S(\mathcal{F}_j)$, there exists an $r_F$ such that $|F \cap [t + r_F]| > (1 - q_j)(t + r_F)$.

Now, we can use Lemma 2.9 to conclude that if $t$ is large enough ($t = \Omega\left(\frac{1}{2\delta^2}\left(\log\frac{1}{\varepsilon} + \log\left(1 + \frac{1}{2\delta^2}\right)\right)\right)$ suffices),
then $S(\mathcal{F}_j)$ must have measure at most $\varepsilon$ under the measure $\mu_{1-q_j-\delta}$, but this along with Observation
2.4 implies that $\mu_{1-q_j-\delta}(\mathcal{F}_j) < \varepsilon$. ■

3 Multi-Layered PCP 

In this section we describe the Multi-Layered PCP constructed in [8] and its useful properties. An instance
$\Phi$ of the Multi-Layered PCP is parametrized by integers $L, R > 1$. The PCP consists of $L$ sets of variables
$X_1, \dots, X_L$. The label set (or range) of the variables in the $l$-th set $X_l$ is a set $R_{X_l}$ where $|R_{X_l}| = R^{O(L)}$.
For any two integers $1 \le l < l' \le L$, the PCP has a set of constraints $\Phi_{l,l'}$ in which each constraint depends
on one variable $x \in X_l$ and one variable $x' \in X_{l'}$. The constraint (if it exists) between $x \in X_l$ and $x' \in X_{l'}$
($l < l'$) is denoted and characterized by a projection $\pi_{x \to x'} : R_{X_l} \to R_{X_{l'}}$. A labeling to $x$ and $x'$ satisfies
the constraint $\pi_{x \to x'}$ if the projection (via $\pi_{x \to x'}$) of the label assigned to $x$ coincides with the label assigned
to $x'$.

The following useful 'weak-density' property of the Multi-Layered PCP was defined in [8].

Definition 3.1 An instance $\Phi$ of the Multi-Layered PCP with $L$ layers is weakly-dense if for any $\delta > 0$,
given $m \ge \lceil 2/\delta \rceil$ layers $l_1 < l_2 < \dots < l_m$ and given any sets $S_i \subseteq X_{l_i}$, for $i \in [m]$, such that $|S_i| \ge \delta |X_{l_i}|$,
there always exist two layers $l_{i'}$ and $l_{i''}$ such that the constraints between the variables in the sets $S_{i'}$ and
$S_{i''}$ are at least a $\frac{\delta^2}{4}$ fraction of the constraints between the sets $X_{l_{i'}}$ and $X_{l_{i''}}$.

The following inapproximability of the Multi-Layered PCP was proven by Dinur et al. O based on the PCP 
Theorem (E], ^) and Raz's Parallel Repetition Theorem (Il26l). 

Theorem 3.2 There exists a universal constant $\gamma > 0$ such that for any parameters $L > 1$ and $R$, there is
a weakly-dense $L$-layered PCP $\Phi = \cup\, \Phi_{l,l'}$ such that it is NP-hard to distinguish between the following two
cases:

• YES Case: There exists an assignment of labels to the variables of $\Phi$ that satisfies all the constraints.

• NO Case: For every $1 \le l < l' \le L$, not more than a $1/R^{\gamma}$ fraction of the constraints in $\Phi_{l,l'}$ can be
satisfied by any assignment.

4 Hardness Reduction for HypVC-Partite



4.1 Construction of the Hypergraph 

Fix a $k \ge 3$, an arbitrarily small parameter $\varepsilon > 0$ and let $r = \lceil 10\varepsilon^{-2} \rceil$. We shall construct a $(k+1)$-uniform
$(k+1)$-partite hypergraph as an instance of $(k+1)$-HypVC-Partite. Our construction will be a reduction
from an instance $\Phi$ of the Multi-Layered PCP with number of layers $L = 32\varepsilon^{-2}$ and parameter $R$ which
shall be chosen later to be large enough. It involves creating, for each variable of the PCP, several copies of
the Long Code endowed with different biased measures as explained below.

Over any domain $T$, a Long Code $\mathcal{H}$ is a collection of all subsets of $T$, i.e. $\mathcal{H} = 2^T$. A bias $p \in [0, 1]$
defines a measure $\mu_p$ on $\mathcal{H}$ such that $\mu_p(v) = p^{|v|}(1 - p)^{|T \setminus v|}$ for any $v \in \mathcal{H}$. In our construction we
need several different biased measures defined as follows. For all $j = 1, \dots, r$, define $q_j := \frac{2j}{rk}$ and biases
$p_j := 1 - q_j - \varepsilon$. Each $p_j$ defines a biased measure $\mu_{p_j}$ over a Long Code over any domain. Next, we define
the vertices of the hypergraph.

Vertices. We shall denote the set of vertices by $V$. Consider a variable $x$ in the layer $X_l$ of the PCP.
For $i \in [k+1]$ and $j \in [r]$, let $\mathcal{H}^x_{ij}$ be a Long Code on the domain $R_{X_l}$ endowed with the bias $\mu_{p_j}$,
i.e. $\mu_{p_j}(v) = p_j^{|v|}(1 - p_j)^{|R_{X_l} \setminus v|}$ for all $v \in \mathcal{H}^x_{ij} = 2^{R_{X_l}}$. The set of vertices corresponding to $x$ is
$V[x] := \bigcup_{i=1}^{k+1} \bigcup_{j=1}^{r} \mathcal{H}^x_{ij}$. We define the weights on vertices to be proportional to their biased measure in the
corresponding Long Code. Formally, for any $v \in \mathcal{H}^x_{ij}$,

$$ \mathrm{wt}(v) := \frac{\mu_{p_j}(v)}{L\,|X_l|\,r(k+1)}. $$ (1)

The above conveniently ensures that for any $l \in [L]$, $\sum_{x \in X_l} \mathrm{wt}(V[x]) = 1/L$, and $\sum_{l \in [L]} \sum_{x \in X_l} \mathrm{wt}(V[x]) = 1$.
In addition to the vertices for each variable of the PCP, the instance also contains $k + 1$ dummy vertices
$d_1, \dots, d_{k+1}$ each with a very large weight given by $\mathrm{wt}(d_i) := 2$ for $i \in [k+1]$. Clearly, this ensures
that the total weight of all the vertices in the hypergraph is $2(k+1) + 1$. As we shall see later, the edges
shall be defined in such a way that, along with these weights, they ensure that the maximum weight independent
set contains all the dummy vertices. Before defining the edges we define the $(k+1)$-partition
$(V_1, \dots, V_{k+1})$ of $V$ to be:

$$ V_i = \left( \bigcup_{l=1}^{L} \bigcup_{x \in X_l} \bigcup_{j=1}^{r} \mathcal{H}^x_{ij} \right) \cup \{d_i\}, $$ (2)

for all $i = 1, \dots, k+1$. We now define the hyperedges of the instance. In the rest of the section, the vertices
shall be thought of as subsets of their respective domains.

Hyperedges. For every pair of variables $x$ and $y$ of the PCP such that there is a constraint $\pi_{x \to y}$, we
construct edges as follows.

(1.) Consider all permutations $\sigma : [k+1] \to [k+1]$ and sequences $(j_1, \dots, j_k, j_{k+1})$ such that $j_1, \dots, j_k \in
[r] \cup \{0\}$ and $j_{k+1} \in [r]$ such that: $\sum_{i=1}^{k} \mathbb{1}_{\{j_i \ne 0\}}\, q_{j_i} \ge 1$.
(2.) Add all possible hyperedges $e$ such that for all $i \in [k]$:

(2.a) If $j_i \ne 0$ then $e \cap V_{\sigma(i)} =: v_{\sigma(i)} \in \mathcal{H}^x_{\sigma(i), j_i}$,

(2.b) If $j_i = 0$ then $e \cap V_{\sigma(i)} = d_{\sigma(i)}$ and,

(2.c) $e \cap V_{\sigma(k+1)} =: u_{\sigma(k+1)} \in \mathcal{H}^y_{\sigma(k+1), j_{k+1}}$,

which satisfy,

$$ \pi_{x \to y}\Big( \bigcap_{i \in [k],\ j_i \ne 0} v_{\sigma(i)} \Big) \cap u_{\sigma(k+1)} = \emptyset. $$ (3)



Let us denote the hypergraph constructed above by $G(\Phi)$. From the construction it is clear that $G(\Phi)$ is
$(k+1)$-partite with partition $V = \bigcup_{i \in [k+1]} V_i$.

Note that the edges are defined in such a way that the set $\{d_1, \dots, d_{k+1}\}$ is an independent set in the
hypergraph. Moreover, since the weight of each dummy vertex $d_i$ is 2, while the total weight of all except the
dummy vertices is 1, this implies that any maximum independent set $\mathcal{I}$ contains all the dummy vertices.
Thus, $V \setminus \mathcal{I}$ is a minimum vertex cover that does not contain any dummy vertices. For convenience, the
analysis of our reduction, presented in the rest of this section, shall focus on the weight of $\mathcal{I} \cap (V \setminus
\{d_1, \dots, d_{k+1}\})$.



4.2 Completeness 

In the completeness case, the instance $\Phi$ is a YES instance, i.e. there is a labeling $A$ which maps each
variable $x$ in layer $X_l$ to an assignment in $R_{X_l}$ for all $l = 1, \dots, L$, such that all the constraints of $\Phi$ are
satisfied.

Consider the set of vertices $\mathcal{I}^*$ which satisfies the following properties:

(1) $d_i \in \mathcal{I}^*$ for all $i = 1, \dots, k+1$.

(2) For all $l \in [L]$, $x \in X_l$, $i \in [k+1]$, $j \in [r]$,

$$ \mathcal{I}^* \cap \mathcal{H}^x_{ij} = \{v \in \mathcal{H}^x_{ij} : A(x) \in v\}. $$ (4)

Suppose $x$ and $y$ are two variables in $\Phi$ with a constraint $\pi_{x \to y}$ between them. Consider any $v \in \mathcal{I}^* \cap V[x]$
and $u \in \mathcal{I}^* \cap V[y]$. The above construction of $\mathcal{I}^*$ along with the fact that the labeling $A$ satisfies the
constraint $\pi_{x \to y}$ implies that $A(x) \in v$ and $A(y) \in u$ and $A(y) \in \pi_{x \to y}(v) \cap u$. Therefore, Equation (3)
of the construction is not satisfied by the vertices in $\mathcal{I}^*$, and so $\mathcal{I}^*$ is an independent set in the hypergraph.
By Equation (4), the fraction of the weight of the Long Code $\mathcal{H}^x_{ij}$ which lies in $\mathcal{I}^*$ is $p_j$, for any variable $x$,
$i \in [k+1]$ and $j \in [r]$. Therefore,

$$ \frac{\mathrm{wt}(\mathcal{I}^* \cap V[x])}{\mathrm{wt}(V[x])} = \frac{1}{r} \sum_{j=1}^{r} p_j = 1 - \frac{1}{k}\left(1 + \frac{1}{r}\right) - \varepsilon, $$ (5)

by our setting of $p_j$ in Section 4.1. The above yields that

$$ \mathrm{wt}(\mathcal{I}^* \cap (V \setminus \{d_1, \dots, d_{k+1}\})) = 1 - \frac{1}{k}\left(1 + \frac{1}{r}\right) - \varepsilon \ \ge\ 1 - \frac{1}{k} - 2\varepsilon, $$ (6)

for a small enough value of $\varepsilon > 0$ and our setting of the parameter $r$.
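The arithmetic behind the completeness bound can be checked with exact rationals. The sketch below assumes $q_j = \frac{2j}{rk}$, which is the value consistent with the bound $p_{j'} \le 1 - \frac{2}{rk}$ used later in the soundness analysis, together with $p_j = 1 - q_j - \varepsilon$ and $r = \lceil 10\varepsilon^{-2} \rceil$. The function name `average_bias` is an assumption introduced here.

```python
import math
from fractions import Fraction


def average_bias(k: int, eps: Fraction) -> Fraction:
    """Average of p_j = 1 - 2j/(rk) - eps over j = 1..r, with r = ceil(10 / eps^2).

    The formula for q_j is an assumption, as explained in the lead-in.
    """
    r = math.ceil(10 / eps ** 2)
    p = [1 - Fraction(2 * j, r * k) - eps for j in range(1, r + 1)]
    return sum(p) / r


k, eps = 3, Fraction(1, 10)
r = math.ceil(10 / eps ** 2)
avg = average_bias(k, eps)
```

The average equals $1 - \frac{1}{k}\left(1 + \frac{1}{r}\right) - \varepsilon$, which is the fraction of each $V[x]$ occupied by the independent set in the YES case.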



4.3 Soundness 

For the soundness analysis we have that $\Phi$ is a NO instance as given in Theorem 3.2 and we wish to prove
that the size of the maximum independent set in $G(\Phi)$ is appropriately small. For a contradiction, we assume
that there is a maximum independent set $\mathcal{I}$ in $G(\Phi)$ such that,

$$ \mathrm{wt}(\mathcal{I} \cap (V \setminus \{d_1, \dots, d_{k+1}\})) \ \ge\ 1 - \frac{k}{2(k+1)} + \varepsilon. $$ (7)

Define the set of variables $X'$ to be as follows:

$$ X' := \left\{ x \text{ a variable in } \Phi \ :\ \frac{\mathrm{wt}(\mathcal{I} \cap V[x])}{\mathrm{wt}(V[x])} \ \ge\ 1 - \frac{k}{2(k+1)} + \frac{\varepsilon}{2} \right\}. $$ (8)

An averaging argument shows that $\mathrm{wt}(\cup_{x \in X'} V[x]) \ge \varepsilon/2$. A further averaging implies that there are
$\frac{\varepsilon}{4} L = \frac{8}{\varepsilon}$ layers of $\Phi$ such that an $\frac{\varepsilon}{4}$ fraction of the variables in each of these layers belong to $X'$. Applying
the Weak Density property of $\Phi$ given by Definition 3.1 and Theorem 3.2 yields two layers $X_{l'}$ and $X_{l''}$
($l' < l''$) such that an $\frac{\varepsilon^2}{64}$ fraction of the constraints between them are between variables in $X'$. The rest of the
analysis shall focus on these two layers and for convenience we shall denote $X' \cap X_{l'}$ by $X$ and $X' \cap X_{l''}$
by $Y$, and denote the respective label sets by $R_X$ and $R_Y$.

Consider any variable $x \in X$. For any $i \in [k+1]$, $j \in [r]$, call a Long Code $\mathcal{H}^x_{ij}$ significant if $\mu_{p_j}(\mathcal{I} \cap \mathcal{H}^x_{ij}) \ge \frac{\varepsilon}{2}$.
From Equation (8) and an averaging argument we obtain that,

$$ \left|\{(i, j) \in [k+1] \times [r] : \mathcal{H}^x_{ij} \text{ is significant}\}\right| \ \ge\ \left(1 - \frac{k}{2(k+1)}\right) r(k+1) = \frac{rk}{2} + r. $$ (9)

Using an analogous argument we obtain a similar statement for every variable $y \in Y$ and corresponding
Long Codes $\mathcal{H}^y_{ij}$. The following structural lemma follows from the above bound.

Lemma 4.1 Consider any variable $x \in X$. Then there exists a sequence $(j_1, \dots, j_{k+1})$ with $j_i \in [r] \cup \{0\}$
for $i \in [k+1]$, such that the Long Codes $\{\mathcal{H}^x_{i,j_i} \mid i \in [k+1] \text{ where } j_i \ne 0\}$ are all significant. Moreover,

$$ \sum_{i=1}^{k+1} j_i \ \ge\ \frac{rk}{2} + r. $$ (10)

Proof: For all $i \in [k+1]$ choose $j_i$ as follows: if none of the Long Codes $\mathcal{H}^x_{ij}$ for $j \in [r]$ are significant
then let $j_i := 0$, otherwise let $j_i := \max\{j \in [r] : \mathcal{H}^x_{ij} \text{ is significant}\}$. It is easy to see that $j_i$ is an upper
bound on the number of significant Long Codes of the form $\mathcal{H}^x_{ij}$. Therefore,

$$ \sum_{i=1}^{k+1} j_i \ \ge\ \left|\{(i, j) \in [k+1] \times [r] : \mathcal{H}^x_{ij} \text{ is significant}\}\right| \ \ge\ \frac{rk}{2} + r \quad \text{(from Equation (9))} $$ (11)

which proves the lemma. ■
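The choice of the sequence in the proof is simple enough to spell out. In the sketch below, the set `significant` of index pairs $(i, j)$ is a hypothetical input standing in for the Long Codes that pass the significance threshold, and the function name `choose_sequence` is an assumption.

```python
from typing import List, Set, Tuple


def choose_sequence(significant: Set[Tuple[int, int]], k: int, r: int) -> List[int]:
    """For each i in 1..k+1 return the largest j with (i, j) significant, or 0."""
    sequence = []
    for i in range(1, k + 2):
        js = [j for j in range(1, r + 1) if (i, j) in significant]
        sequence.append(max(js) if js else 0)
    return sequence


# Hypothetical input with k = 2 and r = 3.
significant = {(1, 1), (1, 3), (3, 2)}
seq = choose_sequence(significant, k=2, r=3)
```

Since $j_i$ is at least the number of significant pairs with first coordinate $i$, the sum of the sequence is at least the total number of significant pairs, which is the inequality used in (11).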
Next we define the decoding procedure to define a label for any given variable $x \in X$.

4.3.1 Labeling for variable $x \in X$

The label $A(x)$ for each variable $x \in X$ is chosen independently via the following three step (randomized)
procedure.

Step 1. Choose a sequence $(j_1, \dots, j_{k+1})$ yielded by Lemma 4.1 applied to $x$.

Step 2. Choose an element $i_0$ uniformly at random from $[k+1]$.

Before describing the third step of the procedure we require the following lemma.

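Lemma 4.2 There exist vertices $v_i \in \mathcal{I} \cap \mathcal{H}^x_{i,j_i}$ for every $i : i \in [k+1] \setminus \{i_0\}$, $j_i \ne 0$, and an integer
$t := t(\varepsilon)$ satisfying:

$$ \Big| \bigcap_{i\,:\, i \in [k+1] \setminus \{i_0\},\ j_i \ne 0} v_i \Big| < t. $$ (12)

Proof: Since $j_{i_0} \le r$ it is easy to see,

$$ \sum_{i \in [k+1] \setminus \{i_0\}} j_i \ \ge\ \frac{rk}{2} \quad \Longrightarrow \quad \sum_{i\,:\, i \in [k+1] \setminus \{i_0\},\ j_i \ne 0} q_{j_i} \ \ge\ 1. $$ (13)

Moreover, since the sequence $(j_1, \dots, j_{k+1})$ was obtained by Lemma 4.1 applied to $x$, we know that
$\mu_{p_{j_i}}(\mathcal{I} \cap \mathcal{H}^x_{i,j_i}) \ge \frac{\varepsilon}{2}$, $\forall i : i \in [k+1] \setminus \{i_0\}$, $j_i \ne 0$. Combining this with Equation (13) and Lemma 2.8 we obtain
that for some integer $t := t(\varepsilon)$ the collection of set families $\{\mathcal{I} \cap \mathcal{H}^x_{i,j_i} : i \in [k+1] \setminus \{i_0\}, j_i \ne 0\}$ is not $k'$-wise
$t$-cross-intersecting, where $k' = |\{i \in [k+1] \setminus \{i_0\} : j_i \ne 0\}|$. This proves the lemma. ■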

The third step of the labeling procedure is as follows:

Step 3. Apply Lemma 4.2 to obtain the vertices $v_i \in \mathcal{I} \cap \mathcal{H}^x_{i,j_i}$ for every $i : i \in [k+1] \setminus \{i_0\}$, $j_i \ne 0$
satisfying Equation (12). Define $B(x)$ as,

$$ B(x) := \bigcap_{i\,:\, i \in [k+1] \setminus \{i_0\},\ j_i \ne 0} v_i, $$ (14)

noting that $|B(x)| < t$. Assign a random label from $B(x)$ to the variable $x$ and call the assigned label $A(x)$.

4.3.2 Labeling for variable $y \in Y$

After labeling the variables $x \in X$ via the procedure above, we construct a labeling $A(y)$ for any variable
$y \in Y$ by defining,

$$ A(y) := \operatorname{argmax}_{a \in R_Y} \left|\{x \in X \cap N(y) \mid a \in \pi_{x \to y}(B(x))\}\right|, $$ (15)

where $N(y)$ is the set of all variables that have a constraint with $y$. The above process selects a label for $y$
which lies in the maximum number of projections of $B(x)$ for variables $x \in X$ which have a constraint with $y$.
The rest of this section is devoted to lower bounding the number of constraints satisfied by the labeling
process, and thus obtaining a contradiction to the fact that $\Phi$ is a NO instance.

4.3.3 Lower bounding the number of satisfied constraints 

Fix a variable $y \in Y$. Let $U(y) := X \cap N(y)$, i.e. the variables in $X$ which have a constraint with $y$.
Further, define the set $P(y) \subseteq [k+1]$ as follows,

$$ P(y) = \{i \in [k+1] \mid \exists j \in [r] \text{ such that } \mu_{p_j}(\mathcal{I} \cap \mathcal{H}^y_{ij}) \ge \varepsilon/2\}. $$ (16)

In other words, $P(y)$ is the set of all those indices in $[k+1]$ such that there is a significant Long Code
corresponding to each of them. Applying Equation (9) to $y$ we obtain that there are at least $\frac{r(k+2)}{2}$ significant Long
Codes corresponding to $y$, and therefore $|P(y)| \ge \frac{k+2}{2} \ge 1$. Next we define subsets of $U(y)$ depending on
the outcome of Step 2 in the labeling procedure for variables $x \in U(y)$. For $i \in [k+1]$ define,

$$ U(i, y) := \{x \in U(y) \mid i \text{ was chosen in Step 2 of the labeling procedure for } x\}, $$ (17)

and,

$$ U^*(y) := \bigcup_{i \in P(y)} U(i, y). $$ (18)

Note that $\{U(i, y)\}_{i \in [k+1]}$ is a partition of $U(y)$. Also, since $|P(y)| \ge \frac{k+2}{2}$ and the labeling procedure for
each variable $x$ chooses the index $i_0$ in Step 2 uniformly and independently at random we have,

$$ \mathbb{E}[|U^*(y)|] \ \ge\ \frac{|U(y)|}{2}, $$ (19)

where the expectation is over the random choice of the indices in Step 2 of the labeling procedure for all
$x \in U(y)$. Before continuing we need the following simple lemma (proved as Claim 5.4 in [8]).

Lemma 4.3 Let $A_1, \dots, A_N$ be a collection of $N$ sets, each of size at most $T \ge 1$. If there are not more
than $D$ pairwise disjoint sets in the collection, then there is an element that is contained in at least $\frac{N}{TD}$ sets.

Now consider any $i' \in P(y)$ such that $U(i', y) \ne \emptyset$ and a variable $x \in U(i', y)$. Since $i' \in P(y)$ there is a
significant Long Code $\mathcal{H}^y_{i',j'}$ for some $j' \in [r]$. Furthermore, since $\mathcal{I}$ is an independent set there cannot be a
$u \in \mathcal{I} \cap \mathcal{H}^y_{i',j'}$ such that $\pi_{x \to y}(B(x)) \cap u = \emptyset$, otherwise the following set of $k + 1$ vertices,

$$ \{v_i \mid i \in [k+1] \setminus \{i'\},\ j_i \ne 0\} \cup \{d_i \mid i \in [k+1] \setminus \{i'\},\ j_i = 0\} \cup \{u\} $$

forms an edge in $\mathcal{I}$, where $v_i, j_i$ ($i \in [k+1]$) are as constructed in the labeling procedure for $x$.
Consider the collection of sets $\pi_{x \to y}(B(x))$ for all $x \in U(i', y)$. Clearly each set is of size less than $t$. Let
$D$ be the maximum number of disjoint sets in this collection. Each disjoint set independently reduces the
measure of $\mathcal{I} \cap \mathcal{H}^y_{i',j'}$ by a factor of $(1 - (1 - p_{j'})^t)$. However, since $\mu_{p_{j'}}(\mathcal{I} \cap \mathcal{H}^y_{i',j'})$ is at least $\frac{\varepsilon}{2}$, this implies
that $D$ is at most $\log(\frac{\varepsilon}{2}) / \log(1 - (\frac{2}{rk})^t)$, since $p_{j'} \le 1 - \frac{2}{rk}$. Moreover, since $t$ and $r$ depend only on $\varepsilon$,
the upper bound on $D$ also depends only on $\varepsilon$.

Therefore by Lemma 4.3 there is an element $a \in R_Y$ such that $a \in \pi_{x \to y}(B(x))$ for at least a $\frac{1}{Dt}$ fraction of
$x \in U(i', y)$. Noting that this bound is independent of $j'$ and that $\{U(i', y)\}_{i' \in P(y)}$ is a partition of $U^*(y)$,
we obtain that there is an element $a \in R_Y$ such that $a \in \pi_{x \to y}(B(x))$ for a $\frac{1}{(k+1)Dt}$ fraction of $x \in U^*(y)$.

Therefore, in Step 3 of the labeling procedure when a label $A(x)$ is chosen uniformly at random from
$B(x)$, in expectation, $a = \pi_{x \to y}(A(x))$ for a $\frac{1}{(k+1)Dt^2}$ fraction of $x \in U^*(y)$. Combining this with Equation
(19) gives us that there is a labeling to the variables in $X$ and $Y$ which satisfies a $\frac{1}{2(k+1)Dt^2}$ fraction of the
constraints between variables in $X$ and $Y$, which is in turn at least an $\frac{\varepsilon^2}{64}$ fraction of the constraints between the
layers $X_{l'}$ and $X_{l''}$. Since $D$ and $t$ depend only on $\varepsilon$, choosing the parameter $R$ of $\Phi$ to be large enough we
obtain a contradiction to our supposition on the lower bound on the size of the independent set. Therefore
in the Soundness case, for any independent set $\mathcal{I}$,

$$ \mathrm{wt}(\mathcal{I} \cap (V \setminus \{d_1, \dots, d_{k+1}\})) \ <\ 1 - \frac{k}{2(k+1)} + \varepsilon. $$

Combining the above with Equation (6) of the analysis in the Completeness case yields a factor $\frac{k^2}{2(k+1)} - \delta$
(for any $\delta > 0$) hardness for approximating $(k+1)$-HypVC-Partite.

Thus, we obtain a factor $\frac{k}{2} - 1 + \frac{1}{2k} - \delta$ hardness for approximating $k$-HypVC-Partite.

A LP Integrality Gap for $k$-HypVC-Partite

This section describes the $\frac{k}{2} - o(1)$ integrality gap construction of Aharoni et al. [1] for the standard LP
relaxation for $k$-HypVC-Partite. The hypergraph that is constructed is unweighted.

Let $r$ be a (large) positive integer. The vertex set $V$ of the hypergraph is partitioned into subsets $V_1, \dots, V_k$
where, for all $i = 1, \dots, k$,

$$ V_i = \{x_{ij} \mid j = 1, \dots, r\} \cup \{y_{il} \mid l = 1, \dots, rk + 1\}. $$ (20)

Before we define the hyperedges, for convenience we shall define the LP solution. The LP values of the
vertices are as given by the function $h : V \to [0, 1]$ as follows: for all $i = 1, \dots, k$,

$$ h(x_{ij}) = \frac{2j}{rk}, \quad \forall j = 1, \dots, r, $$

$$ h(y_{il}) = 0, \quad \forall l = 1, \dots, rk + 1. $$



The set of hyperedges is naturally defined to be the set of all possible hyperedges, choosing exactly one
vertex from each $V_i$ such that the sum of the LP values of the corresponding vertices is at least 1. Formally,

$$ E = \Big\{e \subseteq V \ \Big|\ \forall i \in [k],\ |e \cap V_i| = 1 \text{ and } \sum_{v \in e} h(v) \ge 1\Big\}. $$ (21)

Clearly the graph is $k$-uniform and $k$-partite with $\{V_i\}_{i \in [k]}$ being the $k$-partition of $V$.
The value of the LP solution is

$$ \sum_{v \in V} h(v) = k \sum_{j \in [r]} \frac{2j}{rk} = r + 1. $$ (22)

Now let $V'$ be a minimum vertex cover in the hypergraph. To lower bound the size of the minimum vertex
cover, we first note that the set $\{v \in V \mid h(v) > 0\}$ is a vertex cover of size $rk$, and therefore $|V'| \le rk$.
Also, for any $i \in [k]$ the vertices $\{y_{il}\}_{l \in [rk+1]}$ have the same neighborhood. Therefore, we can assume that
$V'$ has no vertex $y_{il}$, otherwise it will contain at least $rk + 1$ such vertices.
For all $i \in [k]$ let us define indices $j_i \in [r] \cup \{0\}$ as follows:

$$ j_i = \begin{cases} 0 & \text{if } x_{ij} \in V' \text{ for all } j \in [r], \\ \max\{j \in [r] \mid x_{ij} \notin V'\} & \text{otherwise.} \end{cases} $$ (23)

It is easy to see that since $V'$ is a vertex cover,

$$ \sum_{i \in [k]} \frac{2 j_i}{rk} < 1, $$

which implies,

$$ \sum_{i \in [k]} j_i < \frac{rk}{2}. $$

Also, the size of $V'$ is lower bounded by $\sum_{i \in [k]} (r - j_i)$. Therefore,

$$ |V'| \ \ge\ \sum_{i \in [k]} (r - j_i) \ \ge\ rk - \sum_{i \in [k]} j_i \ \ge\ rk - \frac{rk}{2} = \frac{rk}{2}. $$ (24)

The above combined with the value of the LP solution yields an integrality gap of $\frac{rk}{2(r+1)} \ge \frac{k}{2} - o(1)$ for
large enough $r$.
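For very small parameters the construction can be checked by brute force. The sketch below rests on several assumptions introduced here: the LP values are taken to be $h(x_{ij}) = \frac{2j}{rk}$ and $h(y_{il}) = 0$, index 0 in part $i$ stands for a single representative of the $y_{il}$ vertices (they all share the same neighborhood), and only covers made of $x$ vertices are searched, which is justified by the argument above that a minimum cover contains no $y_{il}$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from typing import List, Tuple


def gap_edges(k: int, r: int) -> List[Tuple[int, ...]]:
    """Hyperedges as tuples (j_1, ..., j_k); j_i = 0 denotes a y-vertex of part i.

    An edge needs total LP value at least 1, i.e. 2 * sum(j_i) >= r * k.
    """
    return [js for js in product(range(r + 1), repeat=k) if 2 * sum(js) >= r * k]


def lp_value(k: int, r: int) -> Fraction:
    """Sum of the LP values h(v) over all vertices."""
    return k * sum(Fraction(2 * j, r * k) for j in range(1, r + 1))


def min_cover_size(k: int, r: int) -> int:
    """Smallest set of x-vertices (part, j) hitting every hyperedge; exponential time."""
    edges = gap_edges(k, r)
    xs = [(i, j) for i in range(k) for j in range(1, r + 1)]
    for size in range(len(xs) + 1):
        for cand in combinations(xs, size):
            chosen = set(cand)
            if all(any((i, j) in chosen for i, j in enumerate(e)) for e in edges):
                return size
    return len(xs)


k, r = 3, 2
cover_size = min_cover_size(k, r)
ratio = Fraction(cover_size) / lp_value(k, r)
```

Even for $k = 3$ and $r = 2$ the integral optimum is strictly larger than the LP value $r + 1$. The ratio approaches $k/2$ only as $r$ grows, which this brute-force search cannot reach.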